Abstract: In this paper, we present an exact model for the 
analysis of the performance of Random Linear Network Coding 
(RLNC) in wired erasure networks with finite buffers. In such 
networks, packets are delayed due to either random link erasures 
or blocking by full buffers. We assert that, because of RLNC, the 
contents of the buffers have dependencies that cannot be captured 
directly using classical queueing-theoretic models. We model 
the performance of the network using Markov chains by a careful 
derivation of the buffer occupancy states and their transition 
rules. We verify by simulations that the proposed framework 
results in an accurate measure of the network throughput offered 
by RLNC. Further, we introduce a class of acyclic networks for 
which the number of state variables is significantly reduced. 



I. Introduction 

It is well-known that linear network codes achieve the 
min-cut capacity of networks for unicast applications [1]. 
In fact, random linear codes over large Galois fields suffice 
to achieve the min-cut capacity [2]. Random linear network 
coding (RLNC) has been shown to improve the performance 
in distributed settings with time-varying network parameters. 
In these networks, a distributed and packetized network coding 
scheme, where each node stores received packets and forwards 
random linear combinations of the stored packets when required, 
was introduced in [3]. As a result, for a network of 
nodes with no buffer limitations, all arriving packets at a node 
are stored, and then used to generate new packets to send. 
Hence, there is no information loss. However, in this case, 
upon reception of a packet, a node has to determine whether or 
not the incoming packet is in the linear span of its previously 
stored packets. Further, for generating every coded packet, all 
stored packets need to be accessed. It is therefore desirable 
to have limited buffer sizes, since this limits the complexity of 
the storage and coded-packet generation processes. Further, using 
small buffers at relay nodes simplifies practical issues such 
as on-chip board space and memory-access latency, and 
reduces the average packet delay [4], [5]. 

The problem of computing capacity and designing efficient 
coding schemes for erasure networks has been widely studied 
in the absence of buffer constraints [1], [6], [7]. The limitations 
posed by finite buffers were considered by [8], specifically in 
a simple two-hop line network. Inspired by this work, in [9], 
the authors present a Markov-chain-based approach to model 
the dynamics of the system and the packet occupancy of every 
intermediate node to approximate the performance parameters 
(throughput and latency) of a multi-hop line network with 
lossy links. Several challenges arise when extending the study 
from a single intermediate node to a multi-hop line network. 
Results from [9] were extended to other communication scenarios, 
such as block-based random linear coding for line networks [10], 
and general wired networks with lossless feedback 
and random routing [11]. However, the main challenge of 
modeling the evolution of buffer occupancy or innovativeness 
of buffer contents in general network topologies when RLNC 
is used, was not addressed in these works. 

The queueing theory framework for lossy networks with 
finite buffers of [12], [13] attempts to model the packets of 
the network as customers, the delay due to packet loss over 
links as service times in the nodes, and the buffer size at 
intermediate nodes as the maximum queue size. However, this 
packet-customer equivalence fails to accurately model RLNC 
in general network topologies. This is due to the possibility of 
packet replication at intermediate nodes, or more generally, the 
potential correlation in the contents of the buffers of various 
intermediate nodes. This correlation or dependency between 
contents of the buffers cannot be captured directly in the 
customer-server based queueing model. 

In this paper, our objective is to study the relation between 
throughput of RLNC and the buffer sizes of intermediate 
nodes in the small buffer regime. The first and key step 
in our approach is to derive, using algebraic tools, a state 
of the buffers with which the dynamics of the network can 
be completely characterized. We then derive the state update 
rules for each transmission in the network. Finally, using 
the developed state space and update rules, we obtain the 
throughput of the network using Monte Carlo simulations and 
compare the results to the actual packetized implementation 
of RLNC. We believe the proposed modeling framework is 
a significant step towards developing a theoretical framework 
for computing the throughput capacity and the packet delay 
distribution in general finite-buffer wired networks. 

This paper is organized as follows. First, we present a formal 
definition of the problem and the challenges in Section II. 
Next, we investigate the tools and steps for modeling the 
buffer states in Section III. We then introduce in Section IV 
a general class of networks for which the complexity of our 
modeling is significantly lower. Finally, Section V presents 
our model validation results using simulations. Conclusions 
are summarized in Section VI. 

II. Problem Setup and Challenges 

Throughout this work, we model the network by an acyclic 
directed graph $\vec{G}(V, \vec{E})$, where packets can be transmitted 
over a link $\vec{e} = (u, v)$ only from the node $u$ to $v$. The 
system is analyzed using a discrete-time model; each node can 
transmit at most one packet over a link in an epoch. The loss 
process on each link is assumed to be memoryless, i.e., packets 
transmitted on a link $\vec{e} = (u, v) \in \vec{E}$ are lost randomly 
with a probability of $\varepsilon_{\vec{e}} = \varepsilon_{(u,v)}$. Note that the erasures 
are due to the quality of links (e.g., noise, interference) and 
do not represent packet blocking due to finiteness of the 
buffers. Further, the packet loss processes on different links 
are assumed to be independent. Each node $v \in V$ has a 
buffer size of $m_v$ packets with each packet having a fixed 
size. Source and destination are assumed to be able to store 
an unlimited number of packets. Throughout this paper, node $s$ and 
node $d$ represent the source and destination nodes, respectively. Also, 
for any $x \in [0,1]$, $\bar{x} \triangleq 1 - x$. The unicast information-theoretic 
throughput is defined as the expected rate (in 
packets/epoch) at which information packets are transferred 
from the source to the destination when the network is in 
steady-state. In other words, if $T_k$ is the time it takes for $k$ 
information packets to be transmitted to the destination, the 
throughput capacity is given by 

$$C(\vec{G}) = \lim_{k \to \infty} \frac{k}{T_k}. \quad (1)$$

There are two key challenges in finite-buffer networks. The 
first challenge is the choice of optimal buffer management 
strategy, which also depends on the routing/coding scheme 
that is in use. Due to losses on links, and finiteness of buffers, 
transmission of a packet by a node $u$ on $\vec{e} = (u, v)$ does 
not guarantee successful reception by the node v. Thus, in 
the absence of any feedback, a node u does not know if it 
can delete a packet from its buffer to make room for its next 
incoming packet. Further, it is also unclear if transmitting a 
packet via several parallel paths will increase the throughput. 
The second challenge is due to the possible replication of 
packets in the network. Hence, it is neither possible to model 
the system dynamics by a simple queueing model where 
packets are customers and buffer sizes are queue sizes, nor is it 
feasible to treat the packets as flows in the network. 

Random Linear Network Coding (RLNC) attractively by- 
passes these two challenges. It eliminates the need for a 
feedback strategy to delete stored packets because the physical 
act of storing a packet becomes immaterial. It also eliminates 
the need for active replication by allowing transmitted/stored 
packets to be treated as elements of an abstract vector space. 
This makes RLNC a favorable choice for practical schemes in 
finite-buffer scenarios. 

We consider the following packet-coding scheme introduced 
in [8], which is a finite-buffer adaptation of RLNC. In this 
scheme, at each epoch, random linear coding is used for 
both the packet generation and storage by intermediate nodes. 
As an example, consider a node $u$ of buffer size $m_u$. At a 
given epoch, $u$ generates an encoded packet by forming 
a random linear combination of the $m_u$ stored data packets 
(over a sufficiently large Galois field¹ $\mathbb{F}_q$), and transmits 
the coded packet on an outgoing link. For storage, when a 
packet successfully arrives at a node $v$, the node multiplies 
the received packet by a random vector chosen uniformly from 
$\mathbb{F}_q^{m_v}$, and adds the resultant vector components to each of the 
present buffer contents. 
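
The encoding and storage steps described above can be sketched as follows. This is a minimal illustration and not the implementation used in [8]: the prime field size `Q`, the representation of packets by their coefficient vectors over the original packets, and the helper names `encode` and `store` are all assumptions made for this example.

```python
import random

Q = 2**31 - 1  # prime field size (assumption; the scheme only needs a large field)

def encode(buffer, rng):
    """Return a random linear combination of all packets stored in `buffer`."""
    coeffs = [rng.randrange(Q) for _ in buffer]
    length = len(buffer[0])
    return [sum(c * pkt[x] for c, pkt in zip(coeffs, buffer)) % Q
            for x in range(length)]

def store(buffer, received, rng):
    """Add an independent random multiple of `received` to every buffer slot."""
    betas = [rng.randrange(Q) for _ in buffer]
    return [[(slot[x] + beta * received[x]) % Q for x in range(len(received))]
            for slot, beta in zip(buffer, betas)]

rng = random.Random(0)
k, m = 3, 2  # hypothetical block size and relay buffer size
# Each original packet T_l is represented by the l-th unit coefficient vector.
source = [[1 if a == b else 0 for b in range(k)] for a in range(k)]
relay = [[0] * k for _ in range(m)]  # empty relay buffer with m slots

packet = encode(source, rng)        # coded packet sent by the source
relay = store(relay, packet, rng)   # relay mixes it into every slot
```

In this sketch, after a single reception every slot of `relay` holds a multiple of the same packet, so the buffer is physically full while carrying only one dimension of information.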

Therefore, using RLNC, after just a single packet reception, 
the entire buffer becomes physically full with multiples of the 

¹The size of the Galois field needs to be sufficiently large to increase the 
chance of innovativeness of the coded packet. 



received packet. Thus, even though the buffer of the node 
$u$ is almost always physically full, the number of stored 
packets that are innovative w.r.t. any other subset of nodes 
can vary from 0 to $m_u$. As an example, suppose that two 
nodes a and b receive/store two packets each generated from 
three original packets from a relay c. In this case, a and 
b will have two innovative packets each for the destination. 
Now, suppose a delivers a packet to the destination. Then, 
b still contains two innovative packets for the destination. 
However, if a delivers another packet to the destination, b 
will only have one innovative packet for the destination, 
since both nodes together originally possessed only three 
innovative packets for the destination. In this example, the 
challenges of tracking the number of innovative packets and 
the interdependency between buffer contents get compounded 
further as the packets from a and b are propagated to the 
other intermediate nodes. This interdependency between buffer 
contents signals the need for a novel notion of occupancy 
to track the number of innovative packets each node has for 
the destination, and consequently, to determine the throughput 
capacity of the network. This notion will be formalized in the 
following section. 
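
The two-relay example above can be reproduced numerically with a rank computation over a prime field. The sketch below is illustrative only: the field size `P`, the specific buffer contents of `a` and `b`, the two delivered packets, and the helper names `rank_mod_p` and `innovative` are assumptions chosen so that the example is deterministic.

```python
P = 2**31 - 1  # prime field size (assumption)

def rank_mod_p(rows, p=P):
    """Rank of a list of row vectors over GF(p), by Gaussian elimination."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def innovative(sender, receiver):
    """Packets in span(sender) that are new to span(receiver):
    dim(sender) - dim(sender ∩ receiver) = rank(sender + receiver) - rank(receiver)."""
    return rank_mod_p(sender + receiver) - rank_mod_p(receiver)

# Coefficient vectors over three original packets (hypothetical contents).
a = [[1, 0, 0], [0, 1, 0]]
b = [[0, 1, 1], [1, 0, 1]]
d = []                                   # destination has received nothing yet

history = [innovative(b, d)]             # before any delivery
d.append([1, 1, 0])                      # a delivers one combination of its packets
history.append(innovative(b, d))
d.append([1, 2, 0])                      # a delivers a second combination
history.append(innovative(b, d))
print(history)  # [2, 2, 1]
```

The drop from 2 to 1 in the last step happens without any transmission involving `b`, which is the dependency that a per-node queue length cannot express.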

The main motivating factor to develop a theoretical model 
for these networks is to understand the throughput capacity 
under RLNC. In order to measure the throughput of RLNC in 
these networks, one option is to perform a Monte Carlo simulation 
where encoded packets are generated using coefficients 
in a large finite field $\mathbb{F}_q$, and buffer updates are performed 
upon each successful reception. This simulation is significantly 
time-consuming because of the large-field operations. A theoretical 
model that tracks buffer dynamics based on the occupancy of 
buffers is a simpler alternative. As we will see, the 
developed model provides a more efficient way of measuring 
the performance of finite-buffer networks. Additionally, it 
provides us with intuitive insights on the dynamics of buffer 
updates, which is a major step towards computing performance 
metrics for such networks, and analyzing their key trade-offs. 

III. Exact Modeling of Finite-buffer RLNC 

Here, we introduce the tools and steps that enable us to 
track changes in the buffer contents of nodes. 

To identify the throughput as defined in (1), we assume that 
the source possesses a sufficiently large block of packets that 
has to be transmitted to the destination. The first aim is to 
formalize the notion of buffer occupancy by investigating the 
dimension of the span of the stored packets in the buffers. 
Let $\{T_1, T_2, \ldots, T_k\}$ be the original information packets at 
the source. Let $[n] \triangleq \{1, 2, \ldots, n\}$ denote the set of all 
intermediate nodes, where $n = |V| - 2$. Let $P_{i,j}(t)$ be the 
packet contained in buffer slot $j$ of relay $i$ at time epoch 
$t$, where $P_{i,j}(t) = \sum_{l=1}^{k} \alpha_{i,j,l} T_l$, $i \in [n]$, $j \in [m_i]$, and 
$\alpha_{i,j,l}$ is a coefficient in the chosen Galois field $\mathbb{F}_q$. Let 
$V(S)(t) = \mathrm{span}\{P_{i,j}(t) \mid j \in [m_i], i \in S\}$ for all $S \subseteq [n]$. 
To simplify the notation, we drop the reference to time 
in $V(S)(t)$ and write $V(S)$. Also, we define $S^c = [n] \setminus S$. 

Definition 1: For any two subsets of the intermediate nodes 
$S, S' \subseteq [n]$, we define the innovativeness of $S$ w.r.t. $S'$ at time 
instant $t$ as: 

$$I_{S \to S'} = \dim(V(S)) - \dim(V(S) \cap V(S')). \quad (2)$$



In other words, $I_{S \to S'}$ gives the number of innovative packets 
that the buffer contents of nodes in $S$ can generate and that cannot 
be generated by the contents of the buffers of nodes in $S'$. 

Definition 2: The occupancy vector² $\{b_S\}_{S \subseteq [n]}$ of the 
network is defined to be 

$$b_S = \dim(V(S)) - \dim(V(S) \cap V(S^c)), \quad S \subseteq [n]. \quad (3)$$

The following lemma shows that the knowledge of the occupancy 
vector $\{b_S\}_{S \subseteq [n]}$ is equivalent to knowing the innovativeness 
of any subset of the relay nodes w.r.t. any other subset. This 
result significantly reduces the number of state space variables. 

Lemma 1: For $S, S' \subseteq [n]$, $I_{S \to S'} = b_{S'^c} - b_{(S \cup S')^c}$. 

Proof 1: Proof omitted due to lack of space. 
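
Although the proof is omitted, Definitions 1 and 2 and the identity in Lemma 1 can be checked numerically on a small example. The sketch below is only a sanity check under stated assumptions: the three relay buffers, the field size `P`, and all helper names are hypothetical, and intersections are computed with $\dim(U \cap W) = \dim U + \dim W - \dim(U + W)$.

```python
from itertools import combinations

P = 2**31 - 1  # prime field size (assumption)

def rank_mod_p(rows, p=P):
    """Rank of a list of row vectors over GF(p), by Gaussian elimination."""
    rows = [[x % p for x in r] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % p for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def packets(buffers, nodes):
    """All packets stored at the given set of relays; their span is V(nodes)."""
    return [pkt for i in sorted(nodes) for pkt in buffers[i]]

def innovativeness(buffers, s, s_prime):
    """Definition 1: dim V(S) - dim(V(S) ∩ V(S'))."""
    u, w = packets(buffers, s), packets(buffers, s_prime)
    dim_cap = rank_mod_p(u) + rank_mod_p(w) - rank_mod_p(u + w)
    return rank_mod_p(u) - dim_cap

def occupancy(buffers, s):
    """Definition 2: innovativeness of S w.r.t. its complement."""
    return innovativeness(buffers, s, set(buffers) - set(s))

# Hypothetical buffer contents of three relays (coefficient vectors, k = 4).
buffers = {1: [[1, 0, 0, 0], [0, 1, 0, 0]],
           2: [[0, 1, 0, 0], [0, 0, 1, 0]],
           3: [[1, 0, 1, 0]]}
relays = set(buffers)
subsets = [frozenset(c) for r in range(len(relays) + 1)
           for c in combinations(sorted(relays), r)]
b = {s: occupancy(buffers, s) for s in subsets}

# Lemma 1: I_{S -> S'} = b_{S'^c} - b_{(S ∪ S')^c} for every pair of subsets.
lemma_holds = all(
    innovativeness(buffers, s, sp)
    == b[frozenset(relays - sp)] - b[frozenset(relays - (s | sp))]
    for s in subsets for sp in subsets)
```

In this example no single relay is innovative w.r.t. the other two (each $b_{\{i\}}$ is zero), yet relays 1 and 2 together hold two packets that relay 3 cannot generate.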

Since the occupancy vector provides the innovativeness of 
the contents of each node w.r.t. the remaining nodes, we need 
to be able to track the dynamics of the occupancy vector 
for successful transmissions on links to complete the system 
modeling. To do so, let superscripts $-$ and $+$ denote the status 
of a system parameter before and after a successful packet 
transmission on a link. The following results derive the rules 
for updating the occupancy vector when successful transmissions 
occur. Throughout these results, we use whp/wlp (with high/low 
probability) to qualify an event whose probability of occurrence can be made 
arbitrarily close to one/zero by increasing the field size alone. 

Lemma 2: (Source-to-Relay) The update rules when a relay 
$i$ successfully receives a packet from $s$ are as follows whp. 

• If $i \in S \subseteq [n]$ and $b^-_{\{i\}} < m_i$, then $b^+_S = b^-_S + 1$. 

• If $i \notin S \subseteq [n]$, $b^-_{\{i\}} < m_i$ and $I^-_{\{i\} \to S^c \setminus \{i\}} = m_i$, then 
$b^+_S = b^-_S + 1$. 

• Otherwise, $b^+_S = b^-_S$. 

Proof 2: Proof omitted due to lack of space. 
Lemma 3: (Relay-to-Relay) The update rules when relay $j$ 
successfully receives a packet from relay $i$ are as follows whp. 

• If $i \in S \subseteq [n]$, $j \in S^c$, $I^-_{\{j\} \to S^c \setminus \{j\}} < m_j$ and 
$I^-_{\{i\} \to S^c} > 0$, then $b^+_S = b^-_S - 1$. 

• Otherwise, $b^+_S = b^-_S$. 

Proof 3: See Appendix A. 

Lemma 4: (Relay-to-Destination) The update rules when $d$ 
successfully receives a packet from relay $i$ are as follows whp. 

• If $i \in S \subseteq [n]$ and $I^-_{\{i\} \to S^c} > 0$, then $b^+_S = b^-_S - 1$. 

• Otherwise, $b^+_S = b^-_S$. 

Proof 4: Proof omitted due to lack of space. 

On the whole, an update of buffer occupancy occurs only 
when the delivered packet is innovative for the receiving node 
and the buffer of the receiving node is not full. Next, we 
describe how the state update rules could be utilized to obtain 
the throughput of a network. Let $\vec{E}^* = (\vec{e}_1, \ldots, \vec{e}_{|\vec{E}|})$ be an 
ordering of the edge set $\vec{E}$, and let $l(t) \in \{0,1\}^{|\vec{E}|}$ represent 
the realization of the channels at time $t$. That is, $l_i(t) = 1$ 
if the $i$-th edge in $\vec{E}^*$ does not erase the transmitted 
packet during the epoch $t$. Then, given the occupancy vector 
$\{b_S(t)\}_{S \subseteq [n]}$ and the channel realization $l(t)$, the occupancy 
vector $\{b_S(t+1)\}_{S \subseteq [n]}$ can be determined using the state 
update rules presented in Lemmas 2, 3, and 4. 

²The precise definition of the occupancy vector must consider the packets 
that have already reached $d$ by using $b_S = \dim(V(S)) - \dim(V(S) \cap 
V(S^c \cup \{d\}))$. However, the inclusion of $\{d\}$ affects update rules only when 
dealing with the destination. For simplicity, the equivalent definition without 
the inclusion of $\{d\}$ is used in all cases not involving the destination. 
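
To make the epoch-by-epoch use of the update rules concrete, the sketch below emulates the simplest possible instance: a two-hop line network $s \to 1 \to d$, whose state is the single occupancy value $b_{\{1\}}$. It is an illustration under assumptions that are not taken from the text above: the function name `simulate_two_hop`, the erasure probabilities, and in particular the convention that the relay-to-destination update is applied before the source-to-relay update within an epoch.

```python
import random

def simulate_two_hop(eps1, eps2, m, epochs, seed=0):
    """Emulate s -> 1 -> d using occupancy updates only (no field arithmetic).

    eps1, eps2: erasure probabilities of links (s,1) and (1,d).
    m: buffer size of the relay. Returns delivered packets per epoch.
    """
    rng = random.Random(seed)
    b = 0            # occupancy b_{1}: innovative packets held by the relay
    delivered = 0
    for _ in range(epochs):
        up = rng.random() >= eps1     # link (s,1) does not erase
        down = rng.random() >= eps2   # link (1,d) does not erase
        if down and b > 0:            # relay-to-destination rule (Lemma 4)
            b -= 1
            delivered += 1
        if up and b < m:              # source-to-relay rule (Lemma 2)
            b += 1
    return delivered / epochs

estimate = simulate_two_hop(0.5, 0.5, m=1, epochs=200_000)
```

Under the stated ordering assumption, the occupancy of this toy chain with $m = 1$ and both erasure probabilities equal to 0.5 has stationary distribution $(1/3, 2/3)$, so the estimate should be close to $0.5 \cdot 2/3 = 1/3$, below the min-cut value of 0.5.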

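Further, the state transition probability matrix $T$ for the 
corresponding Markov chain can be identified as follows. 
Let $T_{\vec{e}}$ be the state transition matrix given a successful packet 
transmission on the link $\vec{e}$. For any $\vec{e} \in \vec{E}$, $T_{\vec{e}}$ can be 
determined using Lemmas 2, 3, and 4. Therefore, 

$$T = \sum_{l \in \{0,1\}^{|\vec{E}|}} \Big( \prod_{i:\, l_i = 0} \varepsilon_{\vec{e}_i} \Big) \Big( \prod_{i:\, l_i = 1} \bar{\varepsilon}_{\vec{e}_i} T_{\vec{e}_i} \Big). \quad (4)$$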

This Markov chain can be proved to be irreducible, aperiodic, 
and ergodic [9], [14]. Therefore, it possesses a unique 
steady-state probability distribution. Moreover, due to ergodicity, 
the time averages are equivalent to the statistical averages. 
Therefore, the throughput capacity $C(\vec{G})$ can be determined 
using the steady-state probability of the event that the network 
is in a state wherein the nodes possessing a link to the 
destination have innovative packets as follows. 

$$C(\vec{G}) = \sum_{l \in \{0,1\}^{|\vec{E}|},\ \{b_S(t)\}} \xi(l, \{b_S(t)\}) \cdot \Pr(l, \{b_S(t)\}), \quad (5)$$

where $\xi(l, \{b_S(t)\})$ represents the number of successfully 
transmitted packets when state $\{b_S(t)\}$ and channel realization 
$l$ occur together, and $\Pr(l, \{b_S(t)\})$ is the steady-state 
probability of that joint event. 
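
The construction of the transition matrix and the throughput sum can be worked through on a two-hop line network $s \to 1 \to d$, where the state is the single occupancy value of the relay. The sketch below is a toy instance under assumptions: the helper names, the ordering of updates within an epoch (relay-to-destination first), the use of power iteration for the stationary distribution, and the factorization of the joint probability into the channel-realization probability times the steady-state probability are all choices made for this example.

```python
from itertools import product

def step(b, realization, m):
    """Apply the update rules for one epoch; return (next state, deliveries)."""
    up, down = realization          # success indicators of links (s,1), (1,d)
    delivered = 0
    if down and b > 0:              # relay-to-destination update
        b, delivered = b - 1, 1
    if up and b < m:                # source-to-relay update
        b += 1
    return b, delivered

def realization_weight(l, eps):
    """Probability of channel realization l given per-link erasure rates."""
    w = 1.0
    for li, e in zip(l, eps):
        w *= (1.0 - e) if li else e
    return w

def transition_matrix(eps, m):
    """Sum over all channel realizations, in the spirit of (4)."""
    T = [[0.0] * (m + 1) for _ in range(m + 1)]
    for l in product((0, 1), repeat=len(eps)):
        w = realization_weight(l, eps)
        for b in range(m + 1):
            T[b][step(b, l, m)[0]] += w
    return T

def stationary(T, iters=2000):
    """Stationary distribution by repeated multiplication pi <- pi T."""
    n = len(T)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[a] * T[a][c] for a in range(n)) for c in range(n)]
    return pi

def throughput(eps, m):
    """Expected deliveries per epoch in steady state, in the spirit of (5)."""
    pi = stationary(transition_matrix(eps, m))
    return sum(step(b, l, m)[1] * realization_weight(l, eps) * pi[b]
               for l in product((0, 1), repeat=len(eps))
               for b in range(m + 1))

c1 = throughput((0.5, 0.5), m=1)
c2 = throughput((0.5, 0.5), m=2)
```

For this toy chain the values can be verified by hand from the balance equations: `c1` is 1/3 and `c2` is 0.4, both below the min-cut value of 0.5 and increasing with the buffer size.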

IV. State Size Reduction in a Class of Networks 

In Section III, we observed that the number of state variables 
that we need to track at each time epoch is $2^n - 1$, since 
$b_S$, the innovativeness of every subset of relay nodes w.r.t. 
its complement, must be considered. In this section, we show 
that not all innovativeness terms need to be tracked to completely 
define the state of the system. This is a consequence of the 
intuition gained in line networks [9]. In line networks, we need 
to track only $I_{\{i\} \to S}$, where $S = \{i+1, \ldots, n\}$, i.e., all those 
intermediate nodes that are farther from the source hop-distance-wise. 
Equivalently, for line networks, it suffices that we track 
$b_S$ for $S = \{1, \ldots, i\}$ for $i \in [n]$. Extending that intuition, 
define $\mathcal{A} = \{S \subseteq [n] : \text{every } j \in S^c \text{ has a path in } S^c \text{ to } d\}$, 
as illustrated in Fig. 1. Consider a partition of the set of relay 
nodes into types $\{H_1, H_2, \ldots\}$, where a relay node $v$ belongs 
to $H_k$ if the shortest hop-distance from $v$ to the destination $d$ 
is $k$, and $H_0 = \{d\}$. Define a class of networks $\mathcal{N}$ where every 
link starts at some node in $H_i$ for some $i$ and ends at some 
node in $H_{i-1}$. Figure 2 illustrates a network from this class. 
This structure enables us to track significantly fewer 
innovativeness components using the following result, which 
shows that tracking the occupancy for sets in $\mathcal{A}$ suffices to 
define the system completely. 

Fig. 1. Illustration of a set $S$ in $\mathcal{A}$. 
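
As an aside on the two definitions in this section, both the membership of a network in the class $\mathcal{N}$ and the family $\mathcal{A}$ can be enumerated mechanically for a small topology. The sketch below uses the edge list of Network 1 as read from the erasure probabilities listed in Section V; the function names are hypothetical, and "a path in $S^c$ to $d$" is interpreted here as a directed path whose intermediate nodes all lie in $S^c$.

```python
from itertools import combinations

# Network 1 (edge list taken from the erasure probabilities in Section V).
EDGES = [("s", 1), (1, 2), (1, 3), (2, 4), (3, 4), (4, "d")]
RELAYS = [1, 2, 3, 4]

def hop_distance_to_d(edges):
    """Shortest hop distance from every node to d (breadth-first, reversed)."""
    dist = {"d": 0}
    frontier = ["d"]
    while frontier:
        nxt = []
        for v in frontier:
            for (u, w) in edges:
                if w == v and u not in dist:
                    dist[u] = dist[v] + 1
                    nxt.append(u)
        frontier = nxt
    return dist

def in_class_N(edges):
    """True if every link goes from type H_i to type H_{i-1}."""
    dist = hop_distance_to_d(edges)
    return all(dist[u] == dist[v] + 1 for u, v in edges)

def reaches_d_within(node, allowed, edges):
    """True if `node` has a directed path to d through nodes in `allowed`."""
    stack, seen = [node], {node}
    while stack:
        u = stack.pop()
        for (a, b) in edges:
            if a != u:
                continue
            if b == "d":
                return True
            if b in allowed and b not in seen:
                seen.add(b)
                stack.append(b)
    return False

def family_A(relays, edges):
    """Nonempty subsets S such that every j in S^c reaches d inside S^c."""
    family = []
    for r in range(1, len(relays) + 1):
        for s in combinations(relays, r):
            comp = set(relays) - set(s)
            if all(reaches_d_within(j, comp, edges) for j in comp):
                family.append(set(s))
    return family

A = family_A(RELAYS, EDGES)
print(in_class_N(EDGES), len(A), 2 ** len(RELAYS) - 1)  # True 7 15
```

Under this interpretation, Network 1 belongs to the class, and only 7 of the 15 nonempty subsets of relays fall in $\mathcal{A}$.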




V. Simulation Results 

In this section, we present the results of our performance 
modeling framework using state update rules in comparison 
with an actual packetized implementation of RLNC, and 
will show that our framework accurately models the buffer 
dynamics of the network. 

We consider Network 1 and Network 2 shown in Fig. 3 
to compare the results of our simulations. In Network 1, the 




Fig. 3. Networks considered for simulation: (a) Network 1; (b) Network 2. 

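edges have erasure probabilities $\varepsilon_{(s,1)} = 0.1$, $\varepsilon_{(1,2)} = 0.6$, 
$\varepsilon_{(1,3)} = 0.5$, $\varepsilon_{(2,4)} = 0.4$, $\varepsilon_{(3,4)} = 0.5$, and $\varepsilon_{(4,d)} = 0.1$. 
In Network 2, all the edges have $\varepsilon = 0.5$ except the edges 
$\{(s,1), (s,2), (5,d), (6,d)\}$, for which $\varepsilon = 0.25$. All the 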
intermediate nodes are assumed to have the same buffer size. 
In order to measure the exact performance parameters of this 
network, a block of size k = 10^ packets is sent from the 
source to the destination. Fig. 4 and Fig. 5 present the 




Fig. 4. Throughput of Network 1 for different buffer sizes (horizontal axis: buffer size in packets). 

variations of the throughput measured by actual simulation of 
RLNC and the throughput measured by simulation based on 
the state update rules developed in our work versus the buffer 
size. As can be observed, the throughput predicted by our model is very close to the 
actual simulation results. Further, the results confirm the optimality of 
RLNC for the infinite buffer setting, as the curve approaches 
the min-cut capacity for both networks. It is notable that 



the emulation of RLNC using the derived state update 
rules takes significantly less time than the exact simulation 
of the RLNC scheme. Table I compares the number of states 

TABLE I 

Number of active states vs. buffer size in Network 1. 

Buffer size m   No. of active states   Upper bound (m + 1)^15 
1               44                     32768 
2               600                    14348907 
3               4358                   1073741824 



actually visited (identified by simulations) and a crude upper 
bound on the number of states in the Markov chain model. 
For Network 1, the number of state variables is $2^4 - 1 = 15$, 
and a provable upper bound for the number of states is 
$(m + 1)^{15}$, where $m$ is the buffer size of each intermediate 
node. However, simulations show that the number 
of states actually realized is much smaller than the 
bound. This observation suggests that a closer look 
at the Markov chain to reduce its size can simplify the model, 
thereby rendering it more easily tractable. 
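
The bound in Table I and the fraction of it that is actually visited can be recomputed directly. The snippet below is a small arithmetic check; the variable names are illustrative, and the active-state counts are copied from Table I rather than recomputed.

```python
n = 4                                  # relays in Network 1
num_vars = 2**n - 1                    # one occupancy value per nonempty subset
bounds = {m: (m + 1) ** num_vars for m in (1, 2, 3)}
active = {1: 44, 2: 600, 3: 4358}      # active states reported in Table I
visited_fraction = {m: active[m] / bounds[m] for m in bounds}

for m in bounds:
    print(m, bounds[m], f"{visited_fraction[m]:.2e}")
```

Even for a buffer size of one packet, the visited states amount to roughly 0.13% of the bound, and the fraction shrinks quickly as the buffer grows.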

VI. Conclusion and Future Work 

We have derived a novel notion of buffer occupancy for 
RLNC in wired finite-buffer networks. Using this notion, we 
developed a Markov-chain-based framework that can identify 
the throughput offered by RLNC using Monte Carlo simula- 
tions. This framework offers significant computational benefits 
over a complete simulation of RLNC. Though the size of the 
Markov chain is exponential, simulations suggest that only a very 
small portion of the state space is actually visited. 
A closer look at the state space and a thorough analysis to 
reduce it need to be performed to eventually 
derive analytical throughput estimates. 
Appendix A 
Proof of Lemma 3 

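From Definition 2, it is clear that if $i, j \in S$, then $b^+_S = b^-_S$. 
The same applies when $i, j \in S^c$. For the case $i \in S^c, j \in S$, 
the update rule is $b^+_S = b^-_S$ and the proof is similar to the one 
presented for the case $i \in S, j \in S^c$, which is as follows. 

Hence, here we only assume $i \in S, j \in S^c$. Let 
$\mathcal{A}^- = \{A^-_1, A^-_2, \ldots, A^-_{m_i}\}$, $\mathcal{B}^- = \{B^-_1, B^-_2, \ldots, B^-_{|\mathcal{B}^-|}\}$, 
$\mathcal{C}^- = \{C^-_1, C^-_2, \ldots, C^-_{m_j}\}$ and $\mathcal{D}^- = \{D^-_1, \ldots, D^-_{|\mathcal{D}^-|}\}$ be the 
buffer contents of relay $i$, relays $S \setminus \{i\}$, relay $j$, and relays 
$S^c \setminus \{j\}$ before packet transmission, respectively. Suppose 
packet $E = \sum_{l=1}^{m_i} \alpha_l A^-_l$ successfully transfers from relay $i$ to 
relay $j$. Then, for any $S \subseteq [n]$, we will have $\mathcal{A}^+ = \mathcal{A}^-$, 
$\mathcal{B}^+ = \mathcal{B}^-$, $\mathcal{D}^+ = \mathcal{D}^-$, and 
$\mathcal{C}^+ = \{C^-_1 + \beta_1 E, C^-_2 + \beta_2 E, \ldots, C^-_{m_j} + \beta_{m_j} E\}$. 
Note that the coefficients $\alpha_l$ and $\beta_k$ are chosen 
randomly from $\mathbb{F}_q$. Let $\mathcal{Q}^- = \mathrm{span}\{\mathcal{A}^-\} \cap \mathrm{span}\{\mathcal{C}^- \cup \mathcal{D}^-\}$. 
We consider two cases: 

• Case 1: Suppose there exist $\lambda_l, \theta_k$ such that $\lambda_l \neq 0$ for 
at least one $l$ and $\sum_l \lambda_l C^-_l + \sum_k \theta_k D^-_k = 0$. Hence, 

$$\sum_l \lambda_l C^+_l + \sum_k \theta_k D^+_k = \Big(\sum_l \lambda_l \beta_l\Big) E \in \mathrm{span}\{\mathcal{C}^+ \cup \mathcal{D}^+\}.$$

Therefore, $E \in \mathrm{span}\{\mathcal{C}^+ \cup \mathcal{D}^+\}$ whp. Further, if $\mathcal{Q}^- \neq 
\mathrm{span}\{\mathcal{A}^-\}$, then $E \notin \mathcal{Q}^-$ whp, and $\mathrm{span}\{\mathcal{C}^+ \cup \mathcal{D}^+\} = 
\mathrm{span}\{\mathcal{C}^- \cup \mathcal{D}^- \cup \{E\}\}$. Hence, 

$$b^+_S = \dim(\mathrm{span}\{\mathcal{A}^- \cup \mathcal{B}^-\}) - \dim(\mathrm{span}\{\mathcal{A}^- \cup \mathcal{B}^-\} \cap \mathrm{span}\{\mathcal{C}^- \cup \mathcal{D}^- \cup \{E\}\}),$$

which equals $b^-_S - 1$ because $E \in \mathrm{span}\{\mathcal{A}^-\}$ but $E \notin \mathrm{span}\{\mathcal{C}^- \cup \mathcal{D}^-\}$. 
Note that $\mathcal{Q}^- \neq \mathrm{span}\{\mathcal{A}^-\} \iff I^-_{\{i\} \to S^c} > 0$, and the 
existence of such $\lambda_l, \theta_k \iff I^-_{\{j\} \to S^c \setminus \{j\}} < m_j$. 
On the other hand, if $\mathcal{Q}^- = \mathrm{span}\{\mathcal{A}^-\}$, then $E \in \mathcal{Q}^-$, 
and since this implies $\mathrm{span}\{\mathcal{C}^+ \cup \mathcal{D}^+\} = \mathrm{span}\{\mathcal{C}^- \cup \mathcal{D}^-\}$ whp, 
we will have $b^+_S = b^-_S$. 

• Case 2: Suppose no such $\lambda_l, \theta_k$ as in Case 1 exist. Let 
$\mathcal{F}^- = \{F^-_l : l \in [|\mathcal{F}^-|]\}$ be a basis for $\mathrm{span}\{\mathcal{A}^- \cup \mathcal{B}^-\} \cap 
\mathrm{span}\{\mathcal{C}^- \cup \mathcal{D}^-\}$ with $F^-_l = \sum_k \gamma_{lk} C^-_k + \sum_{k'} \delta_{lk'} D^-_{k'}$. 
Also, let $\mathcal{F}^+ = \{F^+_1, F^+_2, \ldots, F^+_{|\mathcal{F}^-|}\}$, where 

$$F^+_l = F^-_l + \Big(\sum_k \gamma_{lk} \beta_k\Big) E, \quad l \in \{1, 2, \ldots, |\mathcal{F}^-|\}. \quad (6)$$

Note that $F^+_l \in \mathrm{span}\{\mathcal{A}^+ \cup \mathcal{B}^+\} \cap \mathrm{span}\{\mathcal{C}^+ \cup \mathcal{D}^+\}$. 
Suppose $x \in \mathrm{span}\{\mathcal{A}^+ \cup \mathcal{B}^+\} \cap \mathrm{span}\{\mathcal{C}^+ \cup \mathcal{D}^+\}$; then 
there exist representations of $x$ as follows. 

$$x = \sum_k a_k A^+_k + \sum_{k'} b_{k'} B^+_{k'} = \sum_l c_l (C^-_l + \beta_l E) + \sum_{l'} d_{l'} D^-_{l'}.$$

Therefore, we have 

$$x - \Big(\sum_l c_l \beta_l\Big) E \in \mathrm{span}\{\mathcal{A}^- \cup \mathcal{B}^-\} \cap \mathrm{span}\{\mathcal{C}^- \cup \mathcal{D}^-\}$$

$$\Rightarrow x - \Big(\sum_l c_l \beta_l\Big) E = \sum_l \mu_l F^-_l = \sum_l \mu_l \Big(F^+_l - \Big(\sum_k \gamma_{lk} \beta_k\Big) E\Big).$$

Therefore, 

$$x - \sum_l \mu_l F^+_l = \Big(\sum_l c_l \beta_l - \sum_{k,l} \mu_l \gamma_{lk} \beta_k\Big) E = \Phi(x) E. \quad (7)$$

We consider two cases here. 

Sub-case 2a: First, suppose that $\Phi(x) = 0$ for all $x \in 
\mathrm{span}\{\mathcal{A}^+ \cup \mathcal{B}^+\} \cap \mathrm{span}\{\mathcal{C}^+ \cup \mathcal{D}^+\}$. Hence, $\mathrm{span}\{\mathcal{F}^+\} = 
\mathrm{span}\{\mathcal{A}^+ \cup \mathcal{B}^+\} \cap \mathrm{span}\{\mathcal{C}^+ \cup \mathcal{D}^+\}$. Next, we prove 
that members of $\mathcal{F}^+$ are linearly independent. Suppose 
$\sum_l \omega_l F^+_l = 0$; then by (6), 

$$\sum_l \omega_l F^-_l = -\Big(\sum_{l,k} \omega_l \gamma_{lk} \beta_k\Big) E.$$

Here, if $\mathcal{Q}^- \neq \mathrm{span}\{\mathcal{A}^-\}$, then $E \notin \mathrm{span}\{\mathcal{F}^-\}$ whp, and 
$F^+_l$ are linearly independent, again whp. On the other 
hand, if $\mathcal{Q}^- = \mathrm{span}\{\mathcal{A}^-\}$, then $E \in \mathrm{span}\{\mathcal{F}^-\}$ can be 
uniquely represented as a linear combination of $F^-_l$, 
$l \in [|\mathcal{F}^-|]$. Let $E = \sum_l \psi_l F^-_l$. Given a particular 
value of $(\omega_1, \ldots, \omega_{|\mathcal{F}^-|}) \neq 0$, due to the randomness 
of the $\beta_k$'s, the probability that $\sum_l \omega_l F^+_l = 0$ happens 
can be made as small as required 
by choosing a large field size. 

Thus, $F^+_l$ are linearly independent in this case. Therefore, 

$$\dim(\mathrm{span}\{\mathcal{A}^+ \cup \mathcal{B}^+\} \cap \mathrm{span}\{\mathcal{C}^+ \cup \mathcal{D}^+\}) = \dim(\mathrm{span}\{\mathcal{F}^+\}) = \dim(\mathrm{span}\{\mathcal{F}^-\}).$$

Therefore, the update rule will be $b^+_S = b^-_S$. 

Sub-case 2b: Suppose that $\Phi(x) \neq 0$ for some $x \in 
\mathrm{span}\{\mathcal{A}^+ \cup \mathcal{B}^+\} \cap \mathrm{span}\{\mathcal{C}^+ \cup \mathcal{D}^+\}$. Then, from (7), 
$E \in \mathrm{span}\{\mathcal{A}^+ \cup \mathcal{B}^+\} \cap \mathrm{span}\{\mathcal{C}^+ \cup \mathcal{D}^+\}$. Now, if 
$\mathcal{Q}^- = \mathrm{span}\{\mathcal{A}^-\}$, then $E \in \mathrm{span}\{\mathcal{C}^- \cup \mathcal{D}^-\}$, which means 
that $\mathrm{span}\{\mathcal{C}^+ \cup \mathcal{D}^+\} = \mathrm{span}\{\mathcal{C}^- \cup \mathcal{D}^-\}$. Thus, the update 
rule in this case is given by $b^+_S = b^-_S$. On the other 
hand, if $\mathcal{Q}^- \neq \mathrm{span}\{\mathcal{A}^-\}$, then $E \notin \mathrm{span}\{\mathcal{C}^- \cup \mathcal{D}^-\}$. 
However, by (7), $E \in \mathrm{span}\{\mathcal{C}^+ \cup \mathcal{D}^+\}$. Hence, there 
exists a representation of $E$ as follows 

$$E = \sum_l \pi_l (C^-_l + \beta_l E) + \sum_{l'} \varphi_{l'} D^-_{l'}, \quad (9)$$

$$\Rightarrow \Big[1 - \sum_l \pi_l \beta_l\Big] E = \sum_l \pi_l C^-_l + \sum_{l'} \varphi_{l'} D^-_{l'}. \quad (10)$$

Given that $E \notin \mathrm{span}\{\mathcal{C}^- \cup \mathcal{D}^-\}$, it follows from (10) that 
$\sum_l \pi_l \beta_l = 1$, which implies that 

$$\sum_l \pi_l C^-_l + \sum_{l'} \varphi_{l'} D^-_{l'} = 0. \quad (11)$$

However, in Case 2, there cannot be an equation of the 
form (11) unless we have $\pi_l = 0$ for all $l$. Substituting 
$\pi_l = 0$ in (9) results in a contradiction. Thus, Sub-case 
2b occurs wlp. ■ 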



